Asymptotic Peak Utilisation in Heterogeneous Parallel CPU/GPU Pipelines: A Decentralised Queue Monitoring Strategy
Authors
Abstract
Heterogeneous parallel computing has become an unavoidable consequence of the emergence of General-Purpose computing on Graphics Processing Units (GPGPU). The characteristics of a Graphics Processing Unit (GPU), including significant memory transfer latency and complex performance behaviour, demand new approaches to ensuring that all available computational resources are optimally utilised. This paper considers the simple case of a divisible workload based on widely used numerical linear algebra routines and examines the challenges that arise when attempting to use all available resources efficiently, with a view to balancing CPU and GPU utilisation. We suggest a possible queue monitoring strategy that facilitates resource usage for applications that fit the pipeline-parallel architectural pattern on heterogeneous multicore/multi-node CPU and GPU systems.
Similar resources
Achieving Native GPU Performance for Out-of-Card Large Dense Matrix Multiplication
In this paper, we illustrate the possibility of developing strategies to carry out matrix computations on heterogeneous platforms which achieve native GPU performance on very large data sizes up to the capacity of the CPU memory. More specifically, we present a dense matrix multiplication strategy on a heterogeneous platform, specifically tailored for the case when the input is too large to fit...
CPU + GPU scheduling with asymptotic profiling
Hybrid systems with a CPU and a GPU have become the new standard in high-performance computing. The workload can be split and distributed between the CPU and the GPU to exploit data parallelism in hybrid systems. However, it is challenging to split and distribute the workload manually, since GPU performance is sensitive to the workload it receives. Therefore, current dynamic schedulers bal...
A hybrid computing method of SpMV on CPU-GPU heterogeneous computing systems
Sparse matrix–vector multiplication (SpMV) is an important issue in scientific computing and engineering applications. The performance of SpMV can be improved using parallel computing. The implementation and optimization of SpMV on GPU are research hotspots. Due to some irregularities of sparse matrices, the use of a single compression format is not satisfactory. The hybrid storage format can exp...
Implementation of the direction of arrival estimation algorithms by means of GPU-parallel processing in the CUDA environment (Research Article)
Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic warfare, sonar, etc. Beamforming methods such as Minimum Variance Distortionless Response (MVDR) and Delay-and-Sum (DAS), and the subspace-based Multiple Signal Classification (MUSIC), are the best-known DOA estimation techniques. These methods have high computational complexity. Hence using...
A novel cooperative accelerated parallel two-list algorithm for solving the subset-sum problem on a hybrid CPU-GPU cluster
Many parallel algorithms have recently been developed to accelerate solving the subset-sum problem on a heterogeneous CPU–GPU system. However, within each compute node, only one CPU core is used to control one GPU while all the remaining CPU cores sit idle, so a large number of CPU cores are wasted. In this paper, based on a cost-optimal parallel two-list algorithm, we prop...
Journal title:
- Parallel Processing Letters
Volume 22
Publication date: 2012